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SUMMARY 

Variational data assimilation technique applied to the identification of the optimal discretization 
of interpolation operators and derivatives in the nodes adjacent to the boundary of the domain is 
discussed in frames of the linear shallow water model. The advantage of controlling the discretization 
of operators near boundary rather than boundary conditions is shown. Assimilating data produced by 
the same model on a finer grid in a model on a coarse grid, we have shown that optimal discretization 
allows us to correct such errors of the numerical scheme as under-resolved boundary layer and wrong 
wave velocity. 
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1. Introduction 

It is now well known, that even the best model is not sufficient to make a good forecast. 
Any model depends on a number of parameters, requires initial and boundary conditions and 
other data that must be collected and used in the model. However, interpolating or smoothing 
observed data is not the best way to incorporate these data into a model. Lorenz, in his 
pioneer work [1] has shown that a geophysical fluid is extremely sensitive to initial conditions. 
This fact requires to bring the model and its initial data together, in order to work with the 
couple " model-data" , and to identify the optimal initial data for the model taking into account 
simultaneously the information contained in the observational data and in the equations of 
the model. 

Optimal control methods [5] and perturbations theory [3J applied to the data assimilation 
technique ([4], [5]) show the way to do it. They allow to retrieve an optimal initial state for a 
given model from heterogeneous observation fields. Since the early 1990's, many mathematical 
and geophysical teams have been involved in the development of the data assimilation strategy. 
One can cite many papers devoted to this problem, as in the domain of development of different 
techniques for the data assimilation and in the domain of its applications to the atmosphere 
and oceans. 
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However, overwhelming majority of data assimilation methods are now intended to identify 
and reconstruct an optimal initial state for the model. Since Lorenz pQ, who has pointed out 
the importance of precise knowledge of the starting point of the model, essentially the starting 
point is considered as the control parameter and the target of data assimilation. 

Of course, the model's flow is extremely sensitive to its initial point. But, it is reasonable 
to suppose that a geophysical model is also sensitive to many other parameters like bottom 
topography, boundary conditions on rigid and open boundaries, forcing fields and friction 
coefficients. All these parameters and values are also extracted in some way from observational 
data, interpolated to the model's grid and can neither be considered as exact, nor as optimal 
to the model. On the other hand, due to non-linearity and intrinsic instability of model's 
trajectory, its sensitivity to all these external parameters may also be exponential. 

Numerous studies show strong dependence of the model's flow on the boundary data ([5J, 
[7]), on the representation of the bottom topography ([8], [9], [10]), on the wind stress f[TT]. 
[12]), on diffusivity coefficients ([13]) and on fundamental parametrization like Boussincsq 
and hydrostatic hypotheses [H]. However, few papers are devoted to the development of data 
assimilation techniques intended to identify and to control these internal model's parameters. 
One can cite several attempts to use data assimilation in order to identify the bottom 
topography of simple models (|15). [16j ) and in order to control open boundary conditions 
in coastal and regional models ([E], [IE], US)- Boundary conditions on rigid boundaries have 
been controlled by data assimilation for heat equation (see for example [20], [21]), but this 
control concerns the linear parabolic diffusion operator rather than hyperbolic transport and 
advection operators that are more important in geophysical models. 

Studies of the possibility to control boundary conditions on rigid boundaries for equations 
containing hyperbolic operators can be found in [22j on the example of non-linear balance 
equation, in [23] on the example of the wave equation and in |24j and [25) on the example of 
the Burgers equation. The principal possibility to improve the model's solution controlling it's 
boundary values are shown in all these papers. However, as it has been noted in [24] . particular 
attention must be paid to the discretization process which must respect several rules because it 
is the discretization of the model's operators takes into account the set of boundary conditions 
and introduces them into the model. Consequently, instead of controlling boundary conditions 
themself, it may be more useful to identify optimal discretization of differential operators 
in points adjacent to boundaries. This allows us to control directly the way the boundary 
conditions influence the model and to control boundary parameters in a more general way. 
In [53], for example, it was shown that deplacing the boundary helps to correct the error 
occured due to a wrong wave velocity. Adjustment of the boundary position is, however, only 
possible when the coefficients used in approximations of the derivatives are controlled directly 
by data assimilation. Boundary conditions participate in discretized operators, but considering 
the discretization itself, we take into account additional parameters like the position of the 
boundary, lack of resolution of the grid, etc. 

Although the boundary configuration of the ocean is steady and can be measured with 
much better accuracy than the model's initial state, it is not obvious how to represent it 
on the model's grid because of limited resolution. The coastal line of continents possesses a 
very fine structure and can only be roughly approximated by the model's grid. Consequently, 
boundary conditions are defined at the model grid's points which are different from the coast. 
Even the most evident impermeability condition being placed at a wrong point may lead to 
some error in the model's solution. We should accept the flux to be able to cross the boundary 
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in places where the boundary is in water, prescribing some integral properties on the flux. 

Moreover, ocean models frequently include strong and thin boundary currents with intense 
velocity gradients. In this case particular attention must be payed to the approximation of 
the boundary layer because wrong representation of these currents may be responsible for 
drastic deformations in a global solution (see, for example [6]). This may lead us to control the 
discretization of the model's operators in the whole boundary layer rather than in adjacent to 
boundary nodes only. 

In this paper we use 4D-Var data assimilation to control the discretization of derivatives 
and interpolation operators in the boundary regions. The development of the data assimilation 
is illustrated on the example of the linear shallow water model on the Arakawa's C-grid. 
The simplicity of the model allows us to clearly see technical points of the development 
(like the algorithm of differentiation and development of the adjoint equation) without being 
overwhelmed by complexity of operators and grids. The purpose of the paper is to study the 
possibility to control the numerical scheme by data assimilation and the particularities of this 
type of control in view to develop and use the data assimilation to identify optimal numerical 
scheme in coastal regions of ocean models. 

The paper is organized as follows. The second section describes the model, its adjoint and 
the data assimilation procedure. The third section is devoted to the stationary solution of 
the model that includes the Munk boundary layer. And the fourth section discusses the data 
assimilation in the case of inertia-gravity and Rossby waves. 



2. Linear Shallow Water Model 
2.1. Model's equations and discretization 

In order to test the data assimilation procedure we consider the linear shallow water model on 
the /3-plane [26]. [27]: 

du Qtj t 

% + {h + Pv)u = -g^-*v + »Av+^- (1) 
at dy p H 

dt dx dy 

where u(x, y, t) and v(x, y, t) are two velocity components, rj{x 1 y, t) is the sea surface elevation, 
po is the mean density of water, Hq is the characteristic depth of the basin and g is the 
reduced gravity. The model is driven by the surface wind stress with components T x (x,y) and 
T y (x, y) and subjected to the bottom drag that is parametrized by linear terms au, av and to 
the horizontal eddy diffusion parametrized by an harmonic operator fiAu and /iAv. Coriolis 
parameter is supposed to be linear in y coordinate and is presented as (fo + Py). The system is 
defined in a square box of side L with boundary conditions prescribed for u and v. We require 
that both u and v vanish on the whole boundary. No boundary conditions is prescribed to r\. 
We discretize all variables of this equation on the regular Arakawa's C-grid with constant 
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Figure 1. Arakawa C-grid 



grid step h = -It in both x and y directions (see fig[T]) 
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v(ih-h/2,jh) ioi i = 0,...N + = 0,...N 
r)(ih-h/2,jh-h/2) for i = 0, . . . AT + 1, j = 0, . 



.iV + 1 



The system (JTJ contains several spatial operators that must be approximated on this grid. 
First of all, this is the divergence operator in the third equation. We shall note the discretization 
of derivatives in the divergence by D x u and D y v. Second, the gradient operator that is present 
in the first two equations is denoted by G x rj, G y rj. Third, two interpolation operators are 
necessary to calculate variables u and v at points where v and u are defined respectively. 
These operators are composed of two subsequent interpolations: S x u and S y v that provide 
their results at points where 77 is defined, and S y r) and S x rj that give the resulting interpolation 
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at required points. It should be noted that operators S v r] and S x r] are never applied to the 
variable r\ itself, but to the results of interpolation S x u and S y v which are defined at the same 
points as 77. And finally, the fourth operator is the discrete Laplacian that is denoted by A h . 
Using these notations, the discretized system ([I} can be written 

-57 = (/o + (3y)S x S y v - gG x r] - <ru + (iA h u + 



^ = -(fo + (3y)S y S x u-gG y r,-av + fxA h v + ^ (2) 

drj 
~dt 



-H (D x u + D y v) 



We define discretized operators D x u, D y v,G x r],G y r], S x u, S y v, S y i] and S X T) at all internal 
points of the domain as linear combination of the variable's values at four adjacent grid points. 
For example, the derivative and the interpolation of the variable u defined at corresponding 
points writes 



1 1 



{D x u) l _ 1/2j _ 1/2 = Ok u i+k,j-i/2 for i = M + 1, . . . ,N - M 



fc=-2 



I 1 

(S x V,) i _ 1 / 2 ,j-l/2 = T X] a k U i+k,j-l/2 for i = M + 1, 



h 

k=-2 



(3) 

,N-M 



That means the derivative at the point i — 1/2 is presented as linear combination of four values 
of u taken at points from i — 2 to i + 1 with coefficients a® . 

Coefficients au are supposed to be known because we intend to control the approximations 
near the boundary only. In this paper, we use either the sequence a;P = i(0, — 1,1,0) 

for k — —2,-1,0,1, or a® — (1, — 27, 27, — 1). One can easily see that corresponding 
approximations of derivatives are of the second and of the fourth order 



U t - Uj-i 

h 

Uj-2 - 21Uj-i + 21Uj - U i+ i 

To discretize interpolation operators, we use either of = (0,1/2,1/2,0) for k = 

(—2,-1,0,1), or af — ^(—1,9,9,-1) that also provide the second and the fourth order 
interpolation of the variable. 

All other operators are discretized similarly at all internal points. All derivatives use the 
same sequence af? and all interpolation operators are discretized with the same sequence of. 
We shall note in each particular case whether the second or the fourth order scheme is used. 

However, operators D and S are not defined in the boundary region that is considered as 
a band of M grid nodes width that follows the boundary. Discretizations of operators in this 



du 
dx 

du 
dx 



i-l/2 



i-1/2 



ft* 

24 



(4) 



640 V dx 5 y.._ 1/2 



Copyright © 200 John Wiley & Sons, Ltd. 
Prepared using fldauth.cls 



Int. J. Numer. Meth. Fluids 200; 0:1-34 



6 



E. KAZANTSEV 



band are supposed to be different from (J3]) and represent the control variables in this study. 
Moreover, schemes ([3]) can not be used at all for the fourth order approximation because 
they require function's values beyond the boundary. We can, of course, extrapolate variables 
beyond the domain with the necessary order and substitute extrapolated values in ([3]), but it 
is not obvious what extrapolation formula is the best for this purpose. So, in order to obtain 
an optimal boundary approximation assimilating external data, we suppose nothing about 
derivatives near the boundary and write them in a general form 

(A^) m -i/2,i-i/2 = ~ [o%£ + a k^ u k-x,3-i/^ fOT m = 1,2, . . . , M (5) 

This formula represents also a linear combination of values of u at K points adjacent to the 
boundary, but with the coefficients a/a which are considered as particular for each operator 
and for each variable. The constant Q.q xU , which is also particular for each operator, may be 
added in some cases to simulate non uniform boundary conditions like u(0,y) — a^ xM =/= 0. 
All coefficients a are considered as control parameters in this study. They are allowed to vary 
in the variational assimilation procedure intended to identify their optimal set. The number 
of points K used in linear combination is not fixed. We keep the possibility to vary K and to 
study its influence on the assimilation results. In addition to this, we can control derivative and 
interpolation operators at M points adjacent to the boundary. The vicinity of the boundary 
is not restricted to just one point, we use the parameter M to define how many points near 
the boundary will be controlled by the algorithm. 

Moreover, we can require looking for particular a for each point near the boundary. That 
means we may allow the dependence of ao, ak, m on the latitude j in the expression ([5]). The 
question of using cither uniform coefficients or particular a for each latitude will be discussed 
later in this paper. 

Here, we can emphasize the choice of controlling the numerical scheme in the boundary 
region rather than boundary conditions. The general form of boundary conditions that may 
be prescribed for u variable in this model is 

du 

u{t) \ boundary -ri — (t) =r 2 . (6) 

boundary 

We can not write more complex boundary conditions (with second derivatives, for example) 
because we obtain a system with no solution at all. Consequently, we can control only two 
parameters, r\ and r-x- It may be sufficient in particular cases, but, as we shall see further, is 
not sufficient in general. Controlling all coefficients of the numerical scheme (O, we are free to 
choose as many ak >m as we need defining appropriate values of parameters K and M . 

We distinguish a for different variables and different operators allowing different controls of 
derivatives because of the different nature of these variables and different boundary conditions 
prescribed for them. It is obvious, for example, that the approximation of the derivative of 
r\ in the gradient may differ from the approximation of the derivative of u in the divergence. 
Although both operators represent a derivative, boundary conditions for u and 77 are different, 
these derivatives are defined at different points, at different distance from the boundary. 
Consequently, it is reasonable to let them be controlled separately and to assume that their 
optimal approximation may be different with distinct coefficients a 5 *" and a DlV . 

Interpolations and derivatives on the opposite side of the square are also assumed to be 
different. They are calculated with coefficients a which are also considered as unknown control 
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parameters: 

K 



(D x u) N _ m+1/2 j-l/2 = -r(&o,m + ^2 & k,m U N-k+l/2,j-l/2j for m = 1,2, 



k=0 



Time stepping of this model is performed by the leap-frog scheme. The first time step is 
splitted into two Runge-Kutta stages in order to ensure the second order approximation. 



2.2. Tangent and adjoint models 

The approximation of the derivative introduced by ([3|) and ([5|) depends on control variables 
a. Operators are allowed to change their properties near boundaries in order to find the best 
fit with requirements of the model and data. To assign variables a we shall perform data 
assimilation procedure and find their optimal values. Variational data assimilation is usually 
performed by minimization of the specially introduced cost function. The minimization is 
achieved using the gradient of the cost function that is usually determined by the run of the 
adjoint to the tangent linear model. 

We start from development of the tangent linear model that is equal to the Gateaux 
derivative of the original model ^ with respect to the control parameters. To calculate this 
derivative we suppose that the control variables can have small variations and we determine 
how these variations will perturb the model's solution. Thus, we suppose that all a are replaced 
by some a + 5a such that \\Sa\\ << \\a\\. Let the model with a + Sa have a new solution 
(u + 8u, v + 8v, ?/ + 5r)). We get the equation for 5u, Sv, dp as the difference between the 
model with coefficients a + Sa and the model with coefficients a. To obtain this equation, we 
introduce variations of operators. Using the gradient G as an example, we denote the difference 
G(a + Sa) - G(a) by 5G and get 

G(a + 6a)(r] + Srj) - G(a) v = {SG)?] + GSn + {50)5^ 

As it was defined, G is a finite difference approximation of a derivative. This matrix is block- 
bidiagonal (quadridiagonal if the fourth order scheme is used), i.e. each block is composed of 
two or fourth diagonals filled by coefficients a® from ([3]). However, M lines in the beginning and 
M lines at the end of each block have a different structure. They are intended to approximate 
the gradient near the boundary and count each K non-zero elements a as defined in ([5]). It 
is these lines only that depend on the control parameter and there are non-zero elements Sa 
in the matrix SG on these lines only. All other lines of SG are filled by zeroes because the 
approximation of the derivative in the interior part of the domain does not depend on the 
controls a. 

So far, both SG and Srj are supposed to be small, we neglect their product in the right-hand- 
side of this equation. Variation of the divergence operator is written in a similar way with 
already neglected nonlinear term: 

D x (a + 5a){u + Su) — D x (a)u = (8D x )u + D x 5u 

The interpolation operator in ([2|) is composed by two successive interpolations. The linear part 
of it's variation counts, consequently, three terms 

^(q + 8a)Sy{a + Sa)(v + Sv) — S x (a)S v (a)v = (SS x )S y v + S x (SS y )v + S x S y Sv 



Copyright © 200 John Wiley & Sons, Ltd. 
Prepared using fldauth.cls 



Int. J. Numer. Meth. Fluids 200; 0:1-34 



8 



E. KAZANTSEV 



All these matrices (SS X , SS y , 5D X , 5D y etc.) have a similar structure. Their only non-zero 
elements are situated in the beginning and at the end of each block. 

Combining variances of approximations of all differential and interpolation operators, we 
write the system that governs the evolution of perturbations of u, v and r\: 

{fo + /3y)S x S y 5v — gG x 5r] — aSu + fiA h 5u + 
{fo + Py){SS x S y v + S x SS y v) - gSG x i] 

-{fo + f3y)S y S x 6u - gG y Sr] - aSv + fiA h Sv - (7) 
{fo + (3y){SS y S x u + S y SS x u) - gSG y r/ 

-H (D x 5u + D y 8v) - H (6D x u + SD y v) 



with the same zero boundary conditions for Su and Sv as for u and v. No boundary conditions 
is prescribed for Srj as well as for rj. At initial time both Su, Sv and St] vanish because our study 
is confined at the control of the discretization of operators near the boundary only rather than 
joint control of a boundary scheme and initial conditions of the model. We are interested in 
the evolution of a pure perturbation due to boundary scheme. 

We have intentionally distinguished two types of operators in the right-hand-side of ([7]) 
(underlined by one or two lines) . Operators of the first type act in the space of perturbations 
of u, v and rj. Both the argument and the result of these operators are in the space of Su, Sv, Sr]. 
Operator G x , for example, brings the variable Srj to the space of Su, operator S y S x transforms 
the variable Su to the space of Sv, etc. 

The second type of operators distinguished in ([7]) (underlined by two parallel lines) are of 
implicit structure. Their argument in the present form is contained in the matrix itself. Let 
us consider, for example, the expression 5G x rj. True argument of this expression is contained 
in the matrix SG X rather than in the vector rj. In this form, it is the matrix that depends on 
the variations of the control parameter 5a. The vector rj, that can be wrongly considered as 
an argument in this expression, is the solution of the original shallow- water model ([T]). This 
vector depends implicitly on the control parameters a, but does not depend on variations Sa. 

Expressions like 5G x rj are not convenient to carry out further development. Writing an 
adjoint operator, we would better have a constant operator (which does not depend on 5a) 
multiplied by a variable vector (which depends on Sot) . It would be more convenient to rewrite 
products like this in a more explicit form 

SG x r) = rjSa 

where the operator fj is constructed from the solution 77 of the original equation and the vector 
Sa is extracted from the matrix SG as it is shown in [23] • All other products of the second 
type {SD x u 5D y v, 5S x u, etc) are also rewritten in this, more explicit form. We shall further 
use hats to denote matrices that have been constructed from vectors. These matrices are also 
block-matrices. All elements of their blocks are equal to zero except M lines in the beginning 
and M lines at the end of each block. These lines are composed of K values of corresponding 
vector, and, namely, values of approximated function in K nodes near the boundary. 



dSu 
~dt 

dSv 

dSrj 
~~dt 
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In order to write the tangent model in a matrix form we add one additional equation to 
the system ([7]) expressing that control coefficients a are stationary: = 0. Using these 

notations we write: 





( Su \ 


( 


d 


Sv 




dt 


Si] 








\ 



-a + fiA h 
-(/o + (3y)S y S x 
—H D X 




(/o + (3y)S x S y 
-a + fiA h 

— H Dy 





-gG x 
-gG y 







R v 







f 5u\ 




Sv 




Srj 


) 


{ 5a ) 



(8) 



where 

R u Sa 
R v Sa 
Rjj 8a 



(fa + (iy)(5S x S y v + S x 6S y v) - gSG x i] = [(/ + (3y)(S y v + S x v) - gf}]6a 
-(/o + fiy){5S y S x u + S y 5S x u) - g8G y rj = [-(fa + [3y)(S x u + S y u) - grj]5a 
[-H (8D x u + 8D y v) = -H (u + v)]5a (9) 



It has to be noted, that operators R u , R v and R v act from the space of the control variable 
a to the space ofvariations of the model's solution Su, Sv or Srj. Their matrices, consequently, 
are rectangular. The number of lines in their matrices corresponds to dimensions of variables 
in physical space, i.e. N x (N — 1) for u and v and (N — l) 2 for rj. The number of columns 
corresponds to the dimension of the control variable's space or to the total number of control 
coefficients a. This number is equal to the number of controlled operators (eight in the present 
configuration) multiplied by the number of boundaries to control for each operator (usually 
two: either East- West if the operator is in x direction, or North-South if the operator acts 
in y) and multiplied by the number of control coefficients for each boundary (M x (K + 1) 
as defined by ©). The smallest number of control coefficients in the present configuration is 
equal to 48 (K = 2, M = 1). Of course, this small number is not always sufficient. In general 
case, we have to use more controls to get better results, but in any case the number of control 
coefficients is much less than the dimension of the variables space. 

This fact may be taken into account in the analysis of the complexity of variational data 
assimilation in this case and in the making choice of the practical way of calculation of the cost 
function. We can see, the matrix of the tangent linear model |5J is composed by two parts: 
the 3x3 block composed of operators acting in the space of the model's variables and the 
fourth column composed of operators R. The 3x3 block is responsible for the evolution of a 
small perturbation by the model's dynamics, while the column determines the way how this 
perturbation is introduced into the model. The first term is similar for any data assimilation, 
while the second one is specific to the particular variable under control. This term is absent 
when the goal is to identify the initial conditions of the model because the uncertainty in initial 
conditions is introduced only once, at the beginning of the model integration. But, when the 
uncertainty is presented in an internal model parameter, like in this case, the perturbation is 
introduced at each time step of the model. 

So, if we intend to identify an optimal boundary parametrization for a model with an existing 
adjoint developed for data assimilation and identification of its initial state, we can use this 
adjoint as 3 x 3 block because this part is common for any data assimilation. However, the 
fourth column of the matrix (|8|) must be developed from the beginning because it is specific 
to the particular control parameter. This development may be technically difficult. One can 
see, writing adjoint of operators R © is as complex as the adjoint of the 3x3 block. If we 
add nonlinear terms to the model, writing the fourth column will become even more complex 
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than the development ol the 3x3 block. Including, for example, v §p = S x S y v S y D y u term 

in the first model's equation will add six additional terms in the tangent model: two terms in 
the 3x3 block and four terms in the fourth column. 

Regarding the supplementary work to be done developing the tangent and adjoint models 
and taking into account small number of control parameters, it may be reasonable to try to 
calculate the gradient of the cost function by some other method beginning with the simplest 
finite difference method. Of course, this will be more expensive computationally, but the gain 
in the development procedure may compensate this excessive computational cost. 

Moreover, boundary conditions and discretizations of operators near boundary are internal 
model's parameters. They must be identified once for all model runs, while initial conditions 
are external parameters and must be identified for each particular model run. Consequently, 
we have to identify the discretization near boundary on the stage of the development of the 
model while choosing all other internal parameters. Hence, we can accept more time consuming 
data assimilation procedure under condition that it requires less time for it's development. 

However, in this paper we proceed with the calculation of the gradient of the cost function 
using the adjoint of the tangent linear model ([5]). In order to develop the adjoint model, we 
need to introduce the scalar product in the space defined by tangent model. Each element in 
this space is composed of discretized functions u, v and r\ and also the whole set of the control 
coefficients a. A vector in this space has four components <p — ( u , v , Vj a )- 

In this paper we consider a weighted Euclidien scalar product in this space 





( U \ 
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U* 


\ 
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V* 
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J2 U i 
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(10) 
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k,m, operator 
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k,m 
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k.m 



W v 



Denoting the matrix of the tangent model 



The sums in the scalar product is performed over all nodes i,j of the grid of all model's 
variables u, v and 77. The sum of control coefficients a is performed over all k, m : 1 < k < 
K, 1 < m < M §5§ and over all operators "operator" controlled by these coefficients. Weights 

' and w n = -rl- are introduced to bring all variables to dimensionless form. 
' no 

by A, we write it in a short form: 
I \ 

— -A5<p = 0, 6<p \ t=Q = Q 

\ 5a J 

Following [3], [5], we multiply the tangent model by some <jf(t) and integrate this product in 
time from to some T . 

T T T 

<iA54>, dt = 

00 

T 

<^(T),0*(T)» - <c50(O),0*(O)» + J <Scj>{t),-^-- A*4>*r> dt (11) 

n 



= 



1 1 

J <^-A5tt>),(f>*(t)>dt = J <^,^(t>dt- 



dt 
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where A* is the adjoint matrix to A. Consequently, if <f>* satisfies the adjoint model 



— A*cj>* — 0, then we have an equality 

<£<f>(T), 9i>*(T)>=<<ty>(0), 0*(O> 

that we shall use to calculate the gradient of the cost function. 

Complete matricial formulation of the adjoint model, hence, writes: 



(12) 



d_ 

at 



( u * \ 


( 


V* 




V* 

V «* J 


V 



—a 
-(/oH 



f iA h 

^y)s* y s* 

£?* 



(fo + (3y)S*S* y 
-a + {iA h 

gG* y 

r>* 



-H D* 

-h d; 



R* 





( 


u* \ 






V* 






V 


) 


\ 


a* J 



(13) 



where 



■m 



(14) 



K = (fo + Py)(s y v +d*s* x ) 

R* = -H (u*+v*) 

The result of the adjoint model run </>*(0), that is used in (fT2"]) , we shall denote as 

0*(O) = A*{T,0)<j)*{T) (15) 

in order to explicitly show that it was obtained by the adjoint model (|13[) that starts at time 
t = T from the state 4>*(T) and is integrated back in time to t = 0. 

2. 3. Cost function 

One of principal purposes of variational data assimilation consists in the variation of control 
parameters in order to bring the model's solution closer to the observational data. This 
implies the necessity to measure the distance between the trajectory of the model and data. 
Introducing a cost function, we define this measure. Generally speaking, the cost function 
is represented by some norm of the difference between model's solutions and observations, 
eventually accompanied by some regularization term. 

In this paper we use the Euclidean norm because in this preliminary study we are interested 
essentially in the principal possibility to control the boundaries by data assimilation, rather 
than to the difficulties that arise due to imperfect data. Consequently, all the artificial 
observations to be assimilated into the model are considered as perfect, known at any time and 
at any point for all variables it, v and ?y, and no regularization is added to the cost function. 
To characterize the difference between the model's solution and the observational data we use 

i,j i,j i,j 

The norm £ is, in fact, associated with the scalar product (fTO]) with the fourth component a 
equal to zero. Expressing £ in terms of the scalar product, we emphasize its dependence on 
time and control coefficients a: 



e = e{a,t)=<${ot,t)-<i> obs {t),ci>{<x,t) 

(t) \ ( u{a,t) 
via, t) 



/ u{a,t) 
v(a, t) 
r)(a,t) - Tj 




obs { 



v obs (t) 



(t) 



J 



o6s (f)>= 
obs {t) \ 



„obs 

oh 



Tt(a,t)-if»(t) 




» 



(17) 
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The choice of the cost function for variational data assimilation performed to identify 
discretizations of model's operators near boundaries is not obvious. If the purpose is to identify 
the initial conditions of the model, it is reasonable to choose a classical 4D-VAR cost function 

T 

l 4 (a) = J f(a,t)dt (18) 
o 

that gives the same importance to the difference £ 2 at any time. However, assimilating data 
with the purpose to identify an internal model's parameter, it may be more reasonable to 
emphasize the end of the assimilation window [0,T] with respect to it's beginning because in 
the beginning of the model's integration the value of is low and less important. Indeed, 
supposing observational data to be perfect, we start from the exact initial state obtained from 
artificial observations and get £(0) = 0. 

Consequently, we shall try to use two other cost functions 

T 

l fp (a) = £ 2 (a,T), T T (a) = J ' t£ 2 (a,t)dt (19) 

o 

requiring that the end of the interval is weighted more heavily. 

To calculate the gradient of the cost function, we calculate first its variation: 

Slf p — lf p {a + 5a) - lfp(a) = £ 2 (a + 5a 7 T) — £ 2 (a,T) = 

= <4>(a + 5a, T) — 4> ohs (T) , 4>{a + Sa,T) — 4> ohs (T)> - 

- <#(a, T) - <f> obs (T), <j>(a, T) - cj) obs (T)^>~ 

~ 2«^T),0(a,T)-0 obs (T> (20) 

This scalar product is similar to the scalar product in the left-hand-side of (fT2"| with <fi*(T) 
replaced by <j>{a,T) - <j) obs (T). Consequently, if we run the adjoint model (|13p from the state 
4>*(T) = (f>(a,T) - 4> obs {T) back in time from t = T to t = we obtain the state </>*(0). This 
state, being scalarly multiplied by 5(f)(0) provides the variation of the cost function 

51 fp = 2<:5<p(0),<p*(0)» (21) 

As it has been noted, first three components of (50(0) are equal to zero. Consequently, first 
three components of <fi*(0) are multiplied by zero and we are interested only in the fourth 
component of the result of the adjoint model a* . 

Thus, the gradient of the cost function is related to the fourth component of 0*(O) 

Vl fp = 2[0* (0)] 4 - 2[A*(T,0ma,T) - obs (T))] 4 (22) 

and namely to the a*(0) is obtained as the final point of the adjoint model integrations from 
t = T to t = beginning with the state <p*(T) = tf>(a, T) ~ (f> obs . 

Gradients of two other cost functions can be obtained in a similar way 

T 

VX 4 = 2 J A*(T,t)(<jy(a,t)-<j) obs (t))dt 
o 

T 

V2r = 2 / tA*(T,t)(<j)(a,t)-<j) obs (t))dt (23) 
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These gradients are used in the minimization procedure that is implemented in order to find 
the minimum 

1(a) = minX(a) (24) 

a 

Coefficients a are considered as coefficients realizing optimal discretization of the model's 
operators in the boundary regions. 

The minimization procedure used here was developed by Jean Charles Gilbert and Claude 
Lemarechal, INRIA [28]. The procedure uses the limited memory quasi-Newton method. 



3. Stationary solution of the model 

In this section we perform several experiments with the stationary model's solution that 
contains a strong boundary layer. The model is considered in a square box of side length 
L = 4000 km driven by a steady, zonal wind forcing with now classical sinusoidal profile 

My - l/2) 

T x = T COS 

that leads to the formation of a double gyre circulation 29J. The maximal wind tension on 

the surface is taken to be To = 1.5 c ^ ne . 

cm 

As it has been already noted, the Coriolis parameter is linear function in y with f = 10 _4 s _1 
and (3 = 2 x 10 -11 (ms) _1 . The reduced gravity and the depth are respectively equal to 
g = 0.02^, Hq — 1000m. The coefficient of Eckman dissipation is chosen as a — 5 x 10~ 8 s , 

that corresponds to the damping time-scale T CT = 2 x 10 7 s ~ 200 days. The lateral friction 

2 

coefficient \i has been taken to be /i = 800^-, that corresponds to the damping time scale 
Tfj, = 2.5 days for a wave of 100 km length. 

All operators in the model are approximated with the second order accuracy both in the 
interior of the domain and near its boundary. That means all the coefficients a have been chosen 
to provide second order discretization as it has been discussed in the section |2~T1 The expression 
(|5j). that is used to interpolate functions and to calculate their derivatives near boundary, is 
written with K = 2, M =1, ao — 0. Coefficients a are defined as a\ t \ = —1/h, a^.i = l/h for 
all derivative operators and a^i = a2,i = 1/2 for all interpolations. That gives, for example, 

the value of the derivative of u at the point i = 1/2 as ( D x u)x/2 t j-\/% — — ^— . ^Oj'- 1 / 2 _ 
The stationary solution of the model in this configuration possesses a boundary layer near 
the Western boundary. This is a well known Munk layer [3D] that is characterized by the local 
equilibrium between the /3-effect and the lateral dissipation. It's width can be easily calculated 
by solving the Munk equation 

d 3 v 1 /u\ 1/3 

_ _ - v = o, v(0) = 0, «(oo) =0,6= (J J (25) 



The solution has a form 



v(x) = Ce 26 sin(- 



26 



The width of the Munk layer for given parameters is equal to ^j=(p/ /3) 1 / 3 = 124 km. 

v3 
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As it has been discussed in [3T], the model must resolve this layer with at least one grid 
point (optimally more than one grid point) in order to maintain numerical stability. The work 
of [32] emphasized the importance of having at least two grid points in the Munk layer in order 
to minimize the level of spurious oscillations visible in the velocity fields as well in the field of 
the sea surface elevation. 

The resolution of the model in this section is intentionally chosen to be too coarse to resolve 
the Munk layer. The model's grid is composed of 30 nodes in each direction, that means the 
grid-step is equal to 133 km, that is more than the Munk layer's width. As it can be seen in 
figflj there is only one grid node in the layer for variables v and rj and no nodes for u variable. 

Artificial "observational" data are generated by the same model with all the same parameters 
but with 9 times finer resolution (15 km grid step). The model with fine resolution, having 8 
nodes in the Munk layer, resolves explicitly the layer and must have no spurious oscillations. 
All nodes of the coarse grid belong to the fine grid, consequently, we do not need to interpolate 
"observational" data to the coarse grid. We just take values in nodes of the high resolution 
grid that correspond to nodes on the coarse grid. 

Along with the models on the fine grid that is used to generate "observations" and on the 
coarse grid that is used to perform data assimilation experiments, we also use the medium 
resolution model on the grid 3 times finer than the coarse grid. The grid step of the medium 
grid is equal to 44 km, there are 3 nodes in the Munk layer, which is, hence, relatively well 
resolved. 

In order to compare these three models, we use the v velocity field because under-resolution 
of the Munk layer in the most visible in the field of this variable. If we compare profiles of the 
velocity v(x,y) plotted in figJ5]at y = 1000 km (one quarter of the side length), we see that 
the profiles of the models with fine and medium resolutions coincide almost everywhere. The 
only visible difference is seen between the second and the third nodes of the medium-resolution 
grid. The maximum of velocity is located just between these nodes and is not reproduced on 
the medium grid. However, values of the velocity in nodes adjacent to the maximum are close 
to corresponding values of the model on the fine grid. 

The profile obtained with the coarse resolution model (solid line in figE]) shows strong 
spurious oscillations due to the unresolved Munk layer. As expected, one point in the layer is 
not sufficient to suppress the numerical mode in the velocity field. Similar oscillations are also 
present in the fields of the velocity u and SSH r). 

Performing the data assimilation we hope that the optimal discretization of operators near 
the boundary may help us to suppress these spurious oscillations even working on the coarse 
grid. 

3.1. Parameters of the assimilation procedure 

In this section we shall discuss such parameters of the assimilation procedure as the length of 
the assimilation window, the choice of the cost function among I4, Xf p andXy and the physical 
meaning of the control parameters and their eventual dependence on the coordinate. As the 
first experiment, we identify only three coefficients for each operator and each boundary. That 
means, we take K = 2, M = 1 in ([5]). 

We can note that this choice is almost equivalent to the control of the boundary conditions 
prescribed for u and v variables. Indeed, supposing, for example, the boundary conditions 
for u{x,y) for theoperator ^ at x — are expressed according to ©, the derivative 
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Figure 2. Profiles of the velocity v(x, y) at y = 1000 km obtained on the coarse grid with h — 133fcm 
(solid line), medium grid with h = 45 km (dashed line) and fine grid with h — 15 km (bold dashed 

line) 



du 



^ _" r — ^ . Therefore, if we want to control boundary conditions, we can 



of u at x = h/2 that we need in the third equation of the model |T]) can be found as 

_ VM 12, 

1/2 

control either n and in the expression of boundary conditions or a^f" = -jf^y an d 
a 2\ U = ^ r m discretization of the operator ((5]). The result will be the same in both cases: 

we control two parameters looking for minimizing the same cost function (the coefficient ol^\ " 
is not controlled due to keeping formal Dirichlet condition u(0,y) = 0). 

However, even in this case there is a difference between controlling the boundary conditions 
and controlling the discretizations of operators. Controlling the boundary conditions and 
finding optimal r\ and suppose using the same r± and ri for the interpolation operator 
near boundary. But finding the optimal a, we suppose that a D * u and a *> u are independent 
on each other. 

Moreover, we admit that coefficients in the gradient are also controlled by data assimilation, 
despite no boundary condition is prescribed for r\. This provide additional control that may 
help to improve the solution near the boundary. 

From the physical point of view, assuming variable coefficients (either r\ and T2 or a^f n and 
a 2\ U we accept the flux can cross the boundary. Impermeability condition it(0,y) = is no 
longer maintained. This fact must be addressed in the first turn, because it represents a major 
violation of the model's physics. Allowing the fluid to leave the domain in some place, we 
must allow it to come back in some other place in order to be able to keep the mass constant. 
In the present formulation, the coefficient that determines non null flux crossing the 

boundary, is constant. That means, the flow may cross the boundary in the same direction 
all the boundary's long. The flux can not change direction and reenter the domain in some 
place after leaving it in some other place. Consequently, in order to allow the flux to come 
back, we must at least allow control coefficients ao to depend on the coordinates: latitude y 
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Figure 3. Convergence of the cost function in the minimization procedure (left) and evolution of 
the norm of difference with the reference solution (right) in the experiments with all coefficients a 
particular for each point (solid line), with just ao depending on the coordinate (small dashes) and all 

uniform a (long dashes). 



corresponding to the index j for operators S x , D x and G x , and longitude x or index i for 
operators S y , D y and G y . Thus, we obtain a particular discretization of the operators at each 
point. Parameter a® 1 ^ in the expression ([5]) will have a form {a.Q^)j. We can also allow all 
parameters ak. m to depend on the coordinates. 

The initial guess for the minimization procedure is taken as the classical second order 
approximation, the same as the approximation that has been used to run the fine resolution 
model to generate artificial "observations". Three experiments have been performed. In the 
first experiment we allow all a to depend on the coordinate on the boundary; in the second 
experiment it is ao only that depends on coordinate, and in the third experiment all a are 
independent. Assimilations was performed minimizing the 4D-VAR cost function Zj (JTSJ) with 
assimilation window T — 50 days and the results are shown in fig[3j The assimilation window 
is delimited by the vertical dashed line in the figure. 

As one can see in figEJ 150 iterations are sufficient for the minimization process to converge 
in general. In all subsequent iterations the value of the cost function is modified just a little 
and can be considered as already converged to the same value in all three experiments. We 
can notice only a slightly different convergence rate. When the coefficients a are particular for 
each grid point, the convergence is a little slower. The procedure takes about 150 iterations to 
achieve the value obtained after 90 iteration when the control coefficients are uniform or partly 
uniform (just a depends on the coordinate). The final value of the cost function is 80 times 
lower than the starting value, that means the solution is much improved in the assimilation 
process. 

However, the particularity of the data assimilation applied to control the discretization 
of operators consists in the fact that obtained numerical approximation of derivatives and 
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interpolations can be instable. In the assimilation window this numerical instability may be 
dumped by the minimization, but it may bring the solution to infinity beyond the window. 

One can observe this phenomenon in the fig[3] on the right where the evolution of the 
norm of the difference £(t) (flU)) is plotted during 500 days (the assimilation window, as 
it has been already mentioned, is 50 days long). The solution obtained by the model with 
uniform coefficients a is indeed instable. In the assimilation window the line that corresponds 
to this solution is indistinguishable from two other lines, but, after the assimilation's end, the 
difference grows up and leave the picture's frame. 

Hence, as it has been shown above, using all uniform coefficients is not the best choice. Mass 
and momentum fluxes can not be balanced on the boundary and the assimilation procedure 
have to look for the optimum in the domain where the scheme is instable. Consequently, at 
least ao should be allowed to be variable and specific in each point near the boundary in order 
to allow the integral flux across the boundary to be balanced. This requirement substantially 
increases the dimension of the control space. If all control coefficients do not depend on the 
coordinate, the number of coefficients for the linear shallow water model may be as few as 
48. Requiring the dependence of ao on the position of the boundary point, we increase the 
dimension of the control space up to 32 + 16 x N, where N is the number of grid nodes on the 
boundary part where the control is performed. In the experiment with the coarse resolution 
(h = 133 km, N — 30), for example, the dimension increases from 48 to 512. This may be 
really penalizing when we try to avoid the development of the adjoint model and calculate 
the gradient of the cost function by a finite difference method. However, even in this case the 
number of coefficients to control is much less than the dimension of the model's state (900 
points for each of 3 variables). Moreover, if we refine the grid, the number of coefficients will 
always be less than number of model's variables because the number of a is proportional to 
the length of the boundary, while the dimension of the model state is proportional to the area 
of the domain. 

So far, the difference between experiments with all variable coefficients and with variable 
ao only is low, we shall use this configuration in all experiments below. That means, all the 
approximations near boundary will be calculated by the formula 

{D x u) m _ 1/2d -i/2 = \ (("£m )j + a k,m U k-l/2,j-l/2j for TO = 1,2, . . . , M (26) 

The set of optimal coefficients a obtained in this assimilation is the following. All coefficients 
a other than ao have moved very little from their initial guesses (less than 0.01), except a Sx?) 
on the left boundary. The approximation of the interpolation S x rj become 

(S x r))x,j-i/2 = ("o x, ')j + °- 46? ?i/2,j-i/2 + 0.52% /2 ,j-i/2 (27) 

On the other hand, coefficients ao have been really used as controls in this experiment. Their 
deviation from the initial guess (zero) is non negligible and, indeed, they depend on the 
coordinate. Three of them, that have been obtained for the operators in x ( (&()i U )j> ( a o 
and (aQ xV )j) are shown in figd] on the left, and three others, for the operators in y 

(( a 0,i")ii ( a o*i)i and { a o yV )i) on tne right- 
On the left part of this figure we can see two important phenomena. The first one consists 
in the fact that the flux across the boundary is balanced. Despite no balance requirement has 
been imposed in the data assimilation procedure, the integral of the flux over the boundary is 
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Figure 4. Optimal coefficients Qo for operators in x (left) and operators in y (right) as functions of 

y = jh and x = ih. 



close to zero. The normal component of the velocity on the left boundary (i.e. u component) 
shows that the flux leaves the domain on the North and on the South of the boundary and 
returns back in the middle. 

The second phenomenon consists in the fact that coefficients (ctQl u )j and {otQ\)j have a 
certain similarity. As it has been already mentioned, they have been considered as completely 
independent on each other, but the result shows the coefficients (a^yi)] are very close to one 
half of 

Analyzing coefficients «o for operators in y shown in fig[4]on the right, we see their values are 
much smaller than for operators ins. It is reasonable because there is no boundary layers near 
the North and near the South boundaries. These coefficients have maxima only in the North- 
western and the South- Western corners of the domain because of influence of the western 
boundary layer. Out of this layer, only coefficients a^ yr} (solid line) show significant difference 
from zero. The operator S y rj is applied to the result of the interpolation S x u. That means the 
interpolation of the velocity u to points v is performed by formula 

/ S y n, , Uj+1,3/2 + u i,3/2 + «i,l/2 + "i+1,1/2 

Ui + i/2,i = (<V H + ■ 4 

which differs from the standard second order interpolation only by adding (aQ yn )i that 
represents a flux to the West in the western half of the domain and the flux to the East 
in the eastern half as it is shown in figdjon the right. 

The next question we address in this section concerns the length of the assimilation window 
and the form of the cost function. A priori, we know the procedure is computationally less 
efficient when T is long. Computing time per iteration increases with T because each iteration 
is composed by the integration of the direct model from to T and corresponding backward 
integration of the adjoint model. Consequently, it would be more computationally efficient to 
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Table I. Values of £ obtained by assimilation with 4 different assimilation windows and 3 different 

cost functions. 



T 


Number of 




SCO 






5(1000 days) 




(days) 


Iterations 


using 1 fp 


using Xj 


using It 


using l fp 


using X4 


using It 


0.2 


10 


0.042 


0.042 


0.042 


00 


00 


CO 


0.2 


50 


0.041 


0.041 


0.041 


00 


00 


CO 


0.2 


200 


0.040 


0.040 


0.040 


00 


00 


CO 


2 


10 


0.12 


0.041 


0.09 


1.79 


0.74 


1.88 


2 


50 


0.031 


0.032 


0.032 


0.09 


0.05 


0.04 


2 


200 


0.029 


0.030 


0.030 


1.22 


CO 


CO 


20 


10 


0.27 


0.26 


0.27 


0.56 


0.54 


0.64 


20 


50 


0.11 


0.064 


0.08 


0.24 


0.17 


0.14 


20 


200 


0.04 


0.037 


0.04 


0.050 


0.040 


0.036 


200 


10 


0.25 


0.25 


0.25 


0.25 


0.25 


0.25 


200 


50 


0.21 


0.21 


0.21 


0.21 


0.20 


0.19 


200 


200 


0.05 


0.04 


0.03 


0.049 


0.039 


0.035 



choose a small T. However, it is the precision of the assimilation that we want to improve and 
the precision must be used as criterium in the choice of the assimilation window. 

The choice of the cost function among Z4, Tf p and It is not obvious also. As it has been 
already noted, the cost function 2" 4 gives the same importance to all part of the assimilation 
window, while It and Tf p consider the end of the interval to be more important than the 
beginning. 

In order to see the influence of the form of the cost function and the assimilation window, we 
perform the set of experiments with different cost functions and different assimilation windows. 
In all experiments we stop the minimization process at different stages of convergence: in the 
very beginning, just after 10 iterations; in the middle after 50 iterations and after 200 iteration 
allowing almost complete convergence. So far, final values of different cost functions are difficult 
to compare among them, we shall compare final values of £ (|16[) that is independent on the 
cost function in use. Moreover, in each experiment we shall compare not only £(T) calculated 
at the end of the assimilation window, but also £(1000 days) in order to distinguish those 
assimilations that result in an instable scheme. 

Analyzing the Table HJ we can note several interesting particularities. First of all, when 
assimilation window is short, the convergence is very rapid. In fact, 10 iterations are sufficient 
for the minimization to converge when T — 0.2 days (just two time steps of the model). Taking 
into account that iterations with a so short T are extremely cheap in computer time, it seems 
to be advantageous to perform assimilations with short windows. However, these assimilations 
result in an unstable scheme. Two time steps is too short interval to develop the instability. 
The cost function contain no information about stability of the scheme, consequently, the 
minimization chooses a set of coefficients a with no requirements about stability of the obtained 
scheme. 

If we consider assimilation window T = 2 days, we see another interesting particularity. Ten 
iterations are no longer sufficient for the minimization to converge, but after 50 iterations the 
minimization has almost reached the minimum. Next 150 iterations allow just to reduce the 
cost function by 0.002, i.e. less than 10%. But these 150 supplementary iterations transform a 
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stable scheme to an instable one. After 50 iterations the scheme was perfectly stable allowing 
the model to be integrated as long as necessary. At the end of 1000 days model's run we get 
rather low difference with the reference solution of the model on a fine grid. But continuing 
minimization we win a little in the cost function loosing the stability. Consequently, we should 
assume the optimal discretization of operators near the boundary is situated not far from the 
stability limit. Performing more iterations than some maximal allowed quantity, we can cross 
the stability limit and obtain an instable scheme. This maximal number of iterations depends 
on the assimilation window. The scheme becomes instable just after several iterations when 
T = 0.2 days because there is very few information in the cost function about stability. When 
T = 2 days, the cost function knows more about stability (the solution must not diverge 
during at least 2 days), and the minimization supports several dozens of iterations before 
become instable. The stability is not guaranteed even with T = 20 days. This experience is 
not shown in the Table []] but after 1000 iterations we obtain an instable scheme also. 

The conclusion, hence, is simple: short T ensures a rapid convergence, but long T is necessary 
to ensure the stability. Reasonable strategy of data assimilation in this circumstances may 
consist in using variable assimilation window. Performing few iterations with short T we 
obtain the set of coefficients a that ensures low cost function value. This set is corrected 
by assimilation with longer T in order to ensure the stability of the scheme. Of course, it is 
important to restrict the number of iterations with short windows. Otherwise, assimilation 
with long windows will not be able to correct instabilities of the scheme. 

Looking at the results obtained using different cost functions in the assimilation procedure, 
we see just a little difference in final values of £(1000 days). However, the final £ obtained 
with It is a little lower than with any other cost function. Consequently, the hypothesis that 
the end of the assimilation window should be more heavily weighted is reasonable. Excessive 
weighting of the end of the window by If P , however, provides worse results. When we use 
just the final point in the cost function, the value of £(1000 days) is bigger than in all other 
experiments. 

Taking into account the results obtained in this section, we shall use the cost function It and 

perform the data assimilation with variable assimilation window T in all experiments below. 

A comparison of the convergence of the cost function in experiments with fixed assimilation 

windows of different length and with variable windows is shown in figO So far, the value of the 

2X 

cost functions depends explicitly on T, we normalize it and plot the value -7=7- which depends 
on T only implicitly. 

In this figure one can see the same phenomenon as in the Table |TJ When the assimilation 
window is short, the convergence is rapid. Just several iterations are sufficient for minimization 
to converge when T = 0.2 days or T = 2 days. Unfortunately, as we have seen in the Table U 
obtained numerical schemes are instable. When the assimilation window is long, more iterations 
are necessary to converge (about 100 when T — 20 days and about 200 with T = 200 days). 
Moreover, each of these iterations is more expensive in computer time. 

II 

One can see in fig[5] also, the convergence of -^f- with the variable window is rapid, the 
final value of the cost function is the same and obtained set of coefficients a provide a stable 
scheme. In the experiment with variable window we have performed 4 iterations with T — 0.2 
days, 8 iterations with T = 2 days and 20 iterations with each of T = 20 and 200 days. In fact 
the computing time in the experiment with variable window is equivalent to 23 iterations with 
T = 200 days, that means we obtain the same result almost 10 times more rapidly in terms of 
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Figure 5. Convergence of the normalized cost function in the assimilation procedure with different 
assimilation windows and with sequentially increasing T. 



CPU time. 

3.2. Assimilation window and the number of control parameters. 

As it has been already noted, the difference with controlling boundary conditions consists in 
the liberty to increase the number of controlled parameters a increasing either K or M (or 
both) in ([26]) . In this section we perform a set of experiments with different K or M looking 
for the solution of the model on the coarse grid that is close to the reference model's solution 
on a fine grid. To measure the distance between two model's solutions we use the norm of the 
difference £ (fl6|) between them taken at 1000th day of the model time. The time 1000 days is 
chosen as sufficiently long time at which all the intermittent processes are over and the model 
has reached its equilibrium, (the model is linear and dissipative, so the equilibrium exists and 
represents the global attractor of dimension 0). From the Table Q] we see that with K = 2 and 
M = 1 we can obtain £(1000 days)=0.035 while the value of £(1000 days) calculated for the 
original coarse resolution model with classical discretizations of all operators is equal to 0.46. 
That means, performing data assimilation, we have obtained the solution that is more than 
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10 times closer to the reference one than the initial guess. In fact, optimal discretization of 
operators near boundary has allowed us to obtain the solution on the coarse grid (grid step 
h = 133 km) that is similar to the solution on the medium grid (h = 44 km). Indeed, the 
difference £ between equilibria on the medium and on the fine grids is equal to 0.037 that is 
almost the same as for the coarse grid's solution with optimal a. The profile of the velocity 
v(x, y) with x = 1000 km of the solution on the medium grid is presented in figH] together with 
coarse grid's and fine grid's solutions. We see that there is no spurious numerical oscillations 
in the solution on the medium grid. Hence, the data assimilation helped us to remove these 
oscillations on the coarse grid also. 

However, increasing K and M does not help to diminish the difference with the reference 
model. Several experiment have been performed with M — 2 and M — 3 and with K varying 
from 2 to 7, but no significant improvement has achieved. The lowest value of £ obtained in 
these experiments is equal to 0.012. This value is slightly lower than 0.035 obtained with K = 2 
and M = 1, but these values are of the same order. 

That means controlling the discretization of operators near boundary we can avoid spurious 
numerical oscillation due to under-resolved boundary layer and obtain the solution similar to 
the solution on the 3 times finer grid. But we can not go further and to get a solution closer 
to the reference one. This phenomenon can be explained by the analysis of the error field, i.e. 
the differences (uij — u*), (vij — v° b ?) and (rjij — Vi h j S ) that are used to calculate £. The 
error in the initial guess is concentrated in the boundary layer while the model with optimal 
discretization near the boundary provides a solution with more uniform error's distribution 
in the whole domain. Thus, the error in the sea surface elevation in the initial guess reaches 
30 meters in the boundary layer and 2-3 meters in the eastern part of the square. But after 
data assimilation we get an error of 1 meter near the western boundary and 0.3 meter in the 
eastern part. More uniform distribution of the difference with the reference solution indicates 
that the error produced by numerical scheme in the interior of the domain become relatively 
more important. And so far, we do not control the interior scheme we can not reduce this 
error. No matter how many a we use to control the approximation near the boundary, they 
can not improve the approximation in regions far from the boundary. 

In order to verify this hypothesis, we perform a similar set of experiments with the fourth 
order approximation's schemes in all internal points of the domain. As well as before, all 
operators have been approximated by ((3]), but with coefficients a® — 2jr(l; — 27, 27, — 1) for 

k = —2,-1,0,1 for derivatives and af = g(— 1,9, 9,-1) for interpolations. The same initial 
guess a as in experiments with the second order approximation near the boundary has been 
used. 

Data assimilation in all these experiments has been performed with variable windows from 
T = 2 to T = 200 days. As we have seen before, obtained numerical scheme may be instable 
and generate a rapidly divergent solution after the end of assimilation. For this reason, together 
with the cost function over the assimilation window T we look at the value of the norm of 
the difference between the solution on the current iteration and the reference solution £ (|16p 
taken at the end of the test interval, i.e. on the 1000th day (i.e. 5 times the longest assimilation 
window) . The minimization process is interrupted when the scheme begins to loose its stability. 
That means the value of £ taken at the end of the test interval T = 1000 days starts to grow 
despite decreasing of £ at the end of the assimilation window. The lowest value £(1000 days) 
obtained in the assimilation is shown in the Tabic HT1 
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Table II. Values of £(1000 days) obtained by assimilation with different number of control coefficients 

by second and fourth order schemes. 



K and M 


Second order 


Fourth order 


Initial Guess 


0.46 


1.3 


Medium grid 


0.039 


0.028 


K = 2, M = 1 


0.035 


0.15 


K = 3, M = 1 


0.035 


0.15 


A" = 5, M = 1 


0.035 


0.15 


K = 5, M = 2 


0.018 


0.010 


# = 7, M = 3 


0.013 


0.001 


K — 7, M = 4 


0.012 


0.0004 



Considering the initial guess, we can see, that the value of £ at the end of test interval is 
bigger for the model with the fourth order scheme than with the second order one. The reason 
of this is clear: high order scheme work worse when principal physical scales are not resolved 
explicitly. Indeed, the coarse grid of the model does not resolve the Munk boundary layer (|25|) . 
The grid step h is approximately twice the Munk parameter 26 = 2(/i//3) 1 / 3 . But it is the 

ratio (^jfi^J where n is the order of approximation that determines the approximation error 

(U). Higher is the order of approximation, bigger is this ratio and bigger is the error in the 
approximation of the boundary layer. 

Another interesting phenomenon that can be seen in the Table |TT] consists in relative 
independence of the final value of £ on the number of control coefficients K . No matter 
how many terms participate in the linear combination that approximates the derivative or 
interpolates the function, the minimization procedure converges always to the same value. 
Obviously, this fact is in agreement with the feature discussed above. Using more terms 
in linear combination allows us to increase the approximation's order. One can construct a 
second order scheme with two terms only, but with 4 terms any approximation up to fourth 
order is possible. However, as we have seen, increasing the order of approximation does not 
help to improve the model on this coarse grid. Moreover, the interpolation obtained by data 
assimilation (|2~7| approximates the function with zero order O(l). Consequently, it is useless 
to increase the number of terms K in the linear combination because the data assimilation 
does not look for increasing the order. 

On the other hand, increasing M with the fourth order scheme really improves the result. 
Controlling numerical schemes in more than one point near the boundary allows us to obtain 
the solution which is very close to the reference one, obtained on the nine times finer grid. In 
fact, controlling a in one additional point divides the final value of £ by 10. When M is equal 
to 4, £(1000 days) is as small as 4 x 10~ 4 , that means the solution with optimal a is almost 
indistinguishable from the reference solution. 

Let us look at the set of control coefficients a obtained in the last experiment: K = 7, M = 4 
with the fourth order approximation in the interior part of the domain. It is this experiment 
that provides the closest trajectory to the reference one. This set of control parameters contains 
more elements that the set described in the previous section. We control discretization of 
derivatives and interpolation operators in four points near boundary (m = 1,2,3,4) and each 



Copyright © 200 John Wiley & Sons, Ltd. 
Prepared using fldauth.cls 



Int. J. Numer. Meth. Fluids 200; 0:1-34 



24 



E. KAZANTSEV 



AlphaJ) x 10" 3 



D_x(u) m=2 
Slx(ii) m=2 
- S_x(eta) m=2 




y(km) 



Figure 6. Optimal coefficients qq for m = 2 (left) and m = 3 and 4 (right) as functions of y = jTi. 



derivative and interpolation is approximated by linear combination of the function's values in 
seven adjacent points (k — 1, ... ,7). 

However, we get a very similar discretizations in the closest to the boundary points (m = 1). 
All coefficients a other than «o have also moved very little from their initial guess (less than 
0.01), except a Sxn on the left boundary. The approximation of the interpolation S x rj become 

{S x rj) x = (of"") + 0.46t7 1/2 + 0.50%/2 - 0.02^/2 - 0.006r/ 7/2 - 0.006t7 9/2 - O.OOltju/a 

Comparing this expression with (|2T[) . we can see that the difference consists just in replacement 
of the term 0.52t7 3 / 2 by the linear combination of r) in next nodes. 

Coefficients ao have also been really used as controls in this experiment and they have almost 
the same values as in the experiment with K = 2, M = 1. Their dependence on coordinate is 
very similar to the dependence shown in fig [4] 

The coefficients ctk,2 that are used in approximations in nodes next to the first has been 
modified more significantly than coefficients ctk,i in nodes adjacent to the boundary. The 
deviation from the initial guess exceeds 0.01 for all operators in the Munk Layer and for 
D y v, S y v, Syt] iisai the North and the South boundaries. Coefficients ao,2 and ao,3 ao.4 in 
the Munk layer are shown in figEl 

Comparing this figure with the fig[4]on the left, we can see that all a^", Q^", a^" and 
a 2 W are proportional to each other as well as and «g ^ ■ The data used to plot these 

figures allows us to show that — — ^7" anc ^ a o,^ U = 31 • 

Concerning operator S x r], we see that coefficients a x ^ and a Q may reach values 0.18 and 
0.04 respectively. Taking into account the fact that these are dimensional variables, that means 
additional elevation up to 18 and 4 centimeters is added to the interpolation of the sea surface 
hight near boundary. 
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Table III. Values of £(1000 days) obtained by assimilation of all 3 variables and of the altimetry data 

only. 







All variables 






Only r\ 




K and M 


2nd order 


4h order 


Iterations 


2nd order 


4th order 


Iterations 


Initial Guess 


0.46 


1.3 




0.46 


1.3 




K = 2, M = 1 


0.035 


0.15 


200 


0.042 


0.18 


1 200 


K = 5, M = 2 


0.016 


0.010 


1 000 


0.016 


0.015 


5 000 


K = 8, M = 4 


0.012 


0.0004 


3 500 


0.016 


0.0006 


15 000 



3.3. Using restricted observational data. 

In the previous section we have supposed that observational data are available for all variables. 
However, it is the altimetry that is the most probable candidate to be an observable variable 
in real data assimilations. In this section we suppose to have the sea surface elevation n as the 
only observable data. As well as before, these data are produced by a high resolution model on 
the 9 times finer grid, but we put w u = w v = in (|10p in order to neglect all the information 
about u and v in the cost function and in its gradient. 

However, assimilating the sea surface elevation only, we hope to be capable to identify 
numerical schemes for all variables. Consequently, comparing the quality of assimilation we 
keep former w u and w v in (|16p . Thus, the norm of the difference £ between the solution of 
the reference model and the solution with optimal numerical scheme on the boundary takes 
into account the information about all variables. The difference £ is, consequently, directly 
comparable with corresponding £ obtained in previous experiments. 

Several experiments with different K and M have been performed in the same way as in the 
previous section. We are also using £(1000 days) as the final measure of data assimilation's 
quality because this value, being taken well beyond the assimilation window, ensures the 
stability of the scheme. Results of the assimilation of altimetry data only are shown in the 
Table Mil together with several results of assimilation of all variables already shown in the 
Table HH 

Analyzing the Table HH we have concentrated our attention on the final results of the 
assimilation and namely on the value of the difference with the reference solution. Now, when 
we restrict the number of observable variables, we should also pay attention to the cost of 
assimilation. As it has been already mentioned, results in the Table lU have been obtained by 
data assimilation with variable window T. To create the Table IIII( we have re-obtained the 
same results assimilating data with a fixed window T — 200 days. The minimization procedure 
in each experiment has been stopped after stabilization of the cost function. The number of 
iterations in this case is proportional to the computer time spent to minimize the cost function. 
This value is, consequently, comparable in different assimilation experiments. 

Comparing the number of iterations shown in the Table IIII1 we see a valuable growth of 
computing time necessary to converge the minimization process. The number of iterations 
increases rapidly with increasing of the number of control parameters. Bigger K and M ensures 
lower difference with the reference solution, but the assimilation becomes more expensive. 
In fact, the convergence rate (i.e. average decrease of the cost function per iteration) is 
independent on the number of control parameters, but more iterations are necessary to reach 
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Figure 7. Optimal coefficients ao for m — 1 (left) and m = 2 (right) as functions of y 
experiment with the only observable variable r\. 



ih in the 



lower final value of the cost function. Thus, intermediate values of £(1000 days) obtained 
after 200 and 1000 iterations in the experiment with K = 8, M = 4 are equal to 0.15 and 
0.01 respectively, that corresponds exactly to the final values of £ in the experiments with 
K = 2, M = 1 and K = 5, M = 2. 

Restricted number of observable variables leads also to slower convergence. The number 
of iterations with only one observable variable r/ is approximately 5-8 times higher than in 
experiments assimilating data of all three variables. This is not surprising: spurious oscillations 
due to under resolved Munk layer are the most visible in the velocity v while the sea surface 
height data are assimilated. The information, hence, must be transferred to v. This takes more 
time and requires supplementary iterations. 

Concerning assimilation results with only one observable variable, we can see that final 
values of £ are close to corresponding values obtained assimilating data collected for all 
variables. Values in the left part of the Table IIIII are a little lower, but this seems to be 
due to preliminary stop of the minimization procedure. Convergence becomes so slow that 
matches the stop condition despite further decrease is still possible. 

The hypothesis of the preliminary stop is also confirmed by the analysis of obtained 
assimilation results. As well as in previous experiments, all a except ao have moved very 
little from their initial positions. Coefficients ao have been subjected to more significant 
modifications. Their values for operators acting in x coordinate in the experiment with 
K = 8, M = 4 are shown in fig [7] for m = 1 and 2. This figure can be compared with 
figlH and figEl We can see that ao,i for all operators D x u, S x u and S x rj have converged to 
very similar values as in previous experiments. Preliminary stop of the minimization procedure 
results in irregularity of ao.2 for operators D x u and S x u. Their lines have the same appearance 
as corresponding lines in figEl but they contain w some irregular noise which may represent 
the consequence of too early stop of the minimizator. 
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Consequently, the lack of observations leads to a slower convergence. Assimilating the sea 
surface elevation only, we must take into account that the computational cost of the procedure 
may be multiplied by 10. 



4. Waves 

Forced and dissipative linear shallow water model possesses a stationary solution that was 
considered in the previous section. So far, the numerical scheme under control does not depend 
on time, the use of a stationary solution represent an obvious advantage. Of course, it is easier 
to find optimal discretization when the flow is stable. 

In this section we shall look for the set of a that is optimal for a non-stationary solution 
of the shallow water model and namely inertia gravity and Rossby waves. The shallow water 
model is written without forcing nor dissipation: 



= (fo + (3y)S x S y v - gG x r\ 

dv 
di 



-(fo + Py)S y S x u-gG yV (28) 
-H (D x u + D y v) 



The system becomes hyperbolic with exact conservation of total energy and momentum. 
The mass is also conserved as well as in the forced-dissipative model. So far, there is no more 
harmonic dissipation in the problem, boundary conditions have been reformulated in order to 
ensure the existence and uniqueness of a solution: 

u(Q, y) = u{L, y) = v(x, 0) = v(x, L) = (29) 

We keep the same model's parameters as above, including interpolation and derivative 
operators, their discretizations in the interior of the domain ([3]) and near boundaries (|26[) . 
The initial state of the model (12"51) have been chosen as 



u(x,y,Q) = sin(wx/L) sui(2iry/L), v(x, y, 0) = sin(7ra;/L) sm(ny/L), 
T](x,y,0) = H + 10cos(irx/L) cos(ny/(2L)). 

Artificial "observational" data have been produced by the same model on the high resolution 
grid with 263 x 263 nodes (h = 15 km) with the second order approximation scheme in the 
interior of the domain. Assimilation experiments are performed on the 31 x 31 nodes grid 
(h = 133 km) with either second or fourth order approximation. 

As well as before, we use the norm of the difference £ (|16[) to measure the quality 
of assimilation. However, the solution now is not stationary and £ varies in time also. 
Consequently, instead of using a value of £ at some sufficiently distant moment of time, we shall 
examine the evolution of This evolution is compared with the evolution of the difference 
between the solution of the reference model on a high resolution grid and the solution of the 
model on the medium resolution (89 x 89 nodes) grid. This comparison allows us to position 
the assimilation's result. 
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Figure 8. Evolution of the difference £(t) in the assimilation experiment with different K and M with 
the second (left) and fourth (right) order approximation scheme in the interior. 



Several assimilation experiments have been carried out with different number of control 
coefficients K and M . Variable assimilation window have been used in all experiments because 
the convergence is much more slower in this case. Assimilating data when the solution is 
represented by waves, requires more iteration with a fixed long assimilation window. Even 
5000 iterations with window T — 100 days are not sufficient to move significantly from the 
initial guess point and we have to use variable windows in order accelerate the convergence 
rate. 

Solutions of the system (|28p IS composed by inertia-gravity and Rossby waves. 
Characteristical periods of the inertia-gravity wave for the given parameters and initial 
conditions is of order of 17 hours while typical period of Rossby waves is about 100 days. This 
difference in characteristic times makes additional advantages of using variable assimilation 
windows. When the window is short, of several days length, the numerical scheme is adapted 
to correspond to inertia gravity waves. And one hundred days window is more adapted to 
assimilate the information about Rossby waves. 

In this chapter we use the sequence T = 2, 6, 20, 100 days with 50 iterations made for each 
assimilation window T. Experiments with different K and M have been carried out for both 
second and fourth order approximation schemes in the interior of the domain. 

This evolution of the difference £(i) is shown in figO We plot this difference during 300 days 
time interval in order to show the evolution beyond the assimilation window and, in particular, 
to see whether obtained scheme is stable. . 

One can see that the principal error produced by the numerical scheme consists in a wrong 
wave speed of inertia-gravity waves. This error is visible in the fig[5] because there is only one 
trigonometric mode in the initial conditions of the model. The phase speed error with a single 
mode lead to the oscillating behavior of lines representing initial guess in both left and right 
figures. When the second order approximation is used, the initial guess line has a minimum 
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on the 260th day indicating that the wave on the low-resolution grid is exactly one period 
shifted with respect to the wave approximated on the high resolution grid. Using fourth order 
approximation in the interior, we get this minimum earlier, on the 85th day. 

Identification of the optimal numerical scheme at just one point near boundary does not 
help to correct the error in wave velocity. We need to control the approximation at least at two 
points near boundary. However, when M is equal or bigger than two, data assimilation lead 
to similar behavior of the solution. As it is shown in fig|8]the error in velocity is corrected by 
optimal numerical scheme for all M > 2, but only a little improvement is achieved by further 
increasing of M. When M — 7, the difference with the reference solution £ is approximately 
two times lower than this difference obtained with M = 2. Taking into account that the data 
assimilation is much more expensive with M = 7, we can state that using M = 2 is optimal in 
the experiments with inertia-gravity waves. We can note, that similar result has been obtained 
in [23j in experiments with one-dimensional wave equation. It has been shown that if the wave 
composed by multiple trigonometric modes, data assimilation with M — 1 can not correct the 
wave speed error because the the numerical scheme that could be optimal is unstable. We have 
to control the scheme at two points near boundary at least. 

Comparing assimilation results with the test solution obtained using a classical scheme on 
the medium resolution grid (89 x 89), we see the medium resolution also suffers by the error 
in the wave velocity. The error is, obviously, much smaller than in the initial guess case that 
uses a 3 times coarser grid 30 x 30. Smaller error results in a slower increase of the difference 
£. Dashed lines in figEJ that represent the difference between the solution on the medium grid 
and on the fine grid, grow slower than solid lines, but they all reach the same maximal values. 

So far the error in the velocity has been corrected for the solution on the coarse grid with 
optimal discretization near boundary, on long time scales this solution is closer to the reference 
solution than the solution obtained on the medium grid. 

Modifications that have been made in the discretizations of interpolation operators and 
derivatives near boundary are not the same as in the case of the stationary solution. Contrary 
to results of experiments in the previous section, coefficients ao have not moved from their 
initial guesses ao = for all operators. Principal modifications have been applied to the 
discretizations them-self, i.e. to a.k,m '■ k =/= 0. 

Taking into account that experiments in this section have been performed with sufficiently 
large numbers of control coefficients K and it is difficult to understand what global effect 
brings each modification, we consider the coefficients of the Taylor expansion. For each linear 
combination like (|26[) that represent an approximation of the interpolation operator or a 
derivative, we calculate coefficients of the three first terms of the Taylor expansion: 



{D x u) m/2 = ^2^ a k,rn U fc-l/2 = — « L/2 +72,m^ 



fc=l 
K 



i/2 OX 



{S x u) m/2 = ^ a k*m U k-l/2 = 11, mU L/2 +l2, m h — 



fc=l 



m/2 



d 2 u 
dx 



,/2 



m/2 



(30) 



K 



U-l 

Coefficients 7 are obtained simply as 7;,™, = 2^ i\ a k,m These coefficients of the Taylor 

k— 1 

expansion calculated for a that have been obtained in the experiment with K = 5, M = 2 are 
shown in the Tabic [TVl 
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Table IV. Coefficients of the Taylor expansion calculated in the experiment with K = 5, M — 2. 







Second order 






Fourth order 




Operator 


m 


7i 


72 


73 


7i 


72 


73 


(Dili) 


1 


0.0463 


1.0490 


0.0393 


0.0191 


1.1128 


0.2183 


{D x u) 


2 


-0.0137 


0.8763 


0.1620 


-0.0375 


1.0366 


0.0678 


(D x p) 


1 


0.0020 


1.4587 


-0.6313 


-0.0048 


1.2669 


0.3334 


(D x p) 


2 


0.0041 


1.2567 


0.0930 


-0.0065 


1.0930 


0.2289 


(DyV) 


1 


0.1225 


0.6699 


0.5234 


0.0626 


1.0514 


0.0667 


(D y v) 


2 


-0.1252 


1.3678 


-0.3923 


-0.0145 


0.9344 


-0.0462 


(Dy P ) 


1 


-0.0019 


1.3599 


0.2992 


0.0013 


0.9191 


-0.2578 


(Dy P ) 


2 


0.0475 


1.6242 


0.0138 


0.0123 


0.1542 


-0.4943 


{S x u) 


1 


1.2711 


0.4103 


0.5558 


1.2049 


-0.3784 


0.5495 


(S x p) 


1 


0.8935 


0.0345 


0.0006 


0.9211 


-0.2167 


-0.1967 


(S xP ) 


2 


0.7497 


0.1913 


-0.1252 


0.7632 


0.1147 


-0.3810 


(SyV) 


1 


1.1307 


-0.1101 


0.1195 


1.1312 


-0.0580 


-0.0097 


(SyV) 


2 


1.1259 


-0.0702 


0.2285 


1.0686 


0.0113 


0.0139 


(SyP) 


2 


1.5921 


-1.2996 


1.2440 


0.8134 


0.0573 


0.0728 



Taylor expansions show that obtained expressions do not approximate corresponding 
operators. So far, the value of 71 is never zero, the derivative near boundary is always 
approximated with the minus hrst order due to the presence ^1 in it's Taylor expansion. 
Values of 71 may be as high as 0.12, that is comparable with the coefficient in front of the first 
derivative 72, which is, in turn, may be as low as 0.15 and as high as 1.6. 

The first coefficient of the Taylor expansion of the interpolation near boundary varies also 
in a wide range between 0.7 and 1.6. That means interpolation multiplies simultaneously the 
function by some value which is different from 1. 

Similar phenomenon have been seen in [23j in experiments with numerically approximated 
waves propagating with a wrong wave velocity. The data assimilation and control of the 
discretization of the derivatives near the boundary can not modify numerical wave velocity. 
The only way for this control to get a better solution consists in modifying the size of the 
domain by deforming the grid cell adjacent to boundary. The boundary cell of the domain 
is adapted by data assimilation to ensure the wave with numerical velocity propagates the 
modified interval in the same time that the exact wave propagates the exact interval. So far, 
the control can not correct the error in the wave velocity, it commits another error in the 
interval size in order to compensate the first one. 



5. Conclusion 

The purpose of this paper is to study the variational data assimilation procedure applied for 
identification of the optimal paramctrization of derivatives and interpolation operators near 
the boundary on the example of a linear shallow water model. 

Comparing this procedure with now well developed data assimilation intended to identify 
optimal initial data, we can say there are both common points and differences as well. 

Tangent iJTJ and adjoint (Tl"3"|) models are composed of two types of operators: operators 
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acting in the space of perturbations of 8u, Sv and Sr], and operators acting on the variations 
of the control coefficients 5a directly. The first type, that constitute the 3x3 block in the 
matrix of ([5]), governs the evolution of a small perturbation by the model's dynamics. This 
term is common for any data assimilation no matter what parameter we want to identify. The 
second type, ©, determines the way how the uncertainty is introduced into the model. So, 
if we intend to identify an optimal discretization of operators near the boundary for a model 
with an existing adjoint developed for data assimilation and identification of initial point, we 
can use this adjoint as 3 x 3 block in the matrix of {H} because this part is common for any 
data assimilation. However, the fourth column of this matrix ([9]) must be developed from the 
beginning because this column is specific for the particular control parameter. As we have seen, 
the development of the fourth column is as complex as the development of the 3x3 block 
even for a simple linear model. For a non-linear shallow-water model, the development of the 
fourth column will become even more complex because non-linear terms contribute more to 
operators responsible for introducing of uncertainty into the model. 

Using automatic code differentiation for generation of the adjoint model may also become 
more complex in the case of controlling the model's internal parameters. The adjoint model 
is not standard, the model must be rewritten in order to satisfy the automatic differentiation 
requirements. 

Another difference consists in the number of control parameters. The dimension of initial 
point of the model is usually equal to the dimension of the model's state variable. Contrary to 
this, when we control boundary parametrization, the dimension of control variables is much 
lower than the dimension of the model's variable because the dimension of the control is 
proportional to the length of the boundary of the domain, while the dimension of the model's 
state relates to the area of the domain. That means the quantity of controlled parameters 
and the dimension of the gradient of cost function may be much lower than the quantity of 
variables in the models state. 

Taking into account mentioned technical difficulties in development of the adjoint, it may be 
reasonable to try to calculate the gradient by some other method beginning with the simplest 
finite difference method. 

As we have seen in this paper, controlling the discretization of operators in the boundary 
region allows to obtain better results than controlling boundary conditions of the model. The 
possibility to control more than two parameters, as it is the case when boundary conditions are 
controlled, allows us to get lower difference with the reference model. Enlarging the boundary 
region, where derivatives and interpolations are controlled, allows us to obtain the solution 
on the coarse grid which is closer to the reference one than the solution on the 3 times finer 
grid. This is valid as for the stationary solution with under-resolved Munk layer, and for the 
non-stationary solution representing inertia-gravity and Rossby waves. 

In the case of the stationary solution with Munk layer, only one point near the boundary 
may be controlled allowing to avoid spurious oscillations due to under-resolved boundary layer. 
To obtain the solution almost indistinguishable from the reference one, we have to enlarge 
boundary region and to control operators at 4 nodes near boundary. 

Uniform control of all coefficients a all the long the boundary is not the best choice. Mass 
and momentum fluxes can not be balanced on the boundary and the assimilation procedure 
have to look for the optimum in the domain where the scheme is instable. Consequently, at 
least «o should be allowed to be variable and particular in each point near the boundary in 
order to allow the flux across the boundary to be compensated. This requirement substantially 
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increases the dimension of the control variables, but it is necessary to obtain a stable scheme: 
we must at least allow control coefficients cxq to depend on coordinate: on the latitude y 
corresponding to the index j for operators S x , D x and G x , and on the longitude x or index i 
for operators S y , D y and G y . 

Concerning the choice of the assimilation window, we must take into account that 
assimilation using short windows may provide an unstable scheme while using long windows 
is computationally more expensive. Moreover, better precision we want, longer windows we 
must use. We have seen in the experiment with the stationary solution that 10 iterations 
are sufficient for the minimization to converge when T = 0.2 days resulting, however, in an 
unstable scheme. When assimilation window T = 2 days is used, ten iterations are no longer 
sufficient for the minimization to converge. We need at least 50 iterations to bring the cost 
funtion close to the minimum and to keep a stable scheme. Performing 150 supplementary 
iterations we win a little in the cost function but the scheme looses the stability. Assimilation 
window T — 20 days ensures the stability of the scheme during 1000 iterations, but after that 
we obtain an instable scheme also. 

To combine rapid convergence of the assimilation with short T and the stability of obtained 
scheme with long T we may use variable assimilation windows. Performing few iterations with 
short T we obtain the set of coefficients a that provides low cost function value. This set is 
corrected by assimilation with longer T in order to ensure the stability of the scheme. 

When we vary the number of control parameters K and M, we see that parameter K 
influences only a little the assimilation quality. The difference between the solution produced 
by the model with optimal discretization of operators in the boundary region and the reference 
solution is very little sensitive to the number of coefficients a used in the approximation of 
derivatives and interpolations near the boundary. On the other hand, the difference £(t) is 
sensitive to M, that is to the number of grid nodes near boundary in which the discretization 
is controlled by data assimilation. In other words, to the width of the boundary region. We have 
seen, despite under-resolution of the Munk layer by the model's grid, optimal discretization 
allows us to obtain the solution as close to the reference one as we want. It is sufficient to take 
a sufficient M. 

In the experiment with waves, parameter M influences also the assimilation results. One 
have to take M at least equal to 2 in order to compensate the error in wave velocities. 

It has been shown also that the sea surface height may be used as the only observable 
variable. It is possible to reconstruct discretization of all operators using observations of r\ only. 
However, assimilation in this case becomes 5 times more expensive due to slower convergence 
of the minimization procedure. 

Analyzing coefficients a obtained in assimilation, we can note that optimal parametrization 
of interpolations and derivatives near the boundary may approximate nothing in classical sense, 
i.e. it may not be valid for an arbitrary function. We have seen here that coefficients 7 of the 
Taylor expansion of obtained expressions may be as high as 1.6 and as low as 0.15 instead of 
1. Moreover, the approximation of the derivative has minus first order. Hence, we must take 
into account that coefficients found by data assimilation do not approximate interpolations or 
derivatives in general. They are valid for given model's parameters only. 

Thus, there exist a possibility to apply data assimilation technique to identification of 
optimal discretizations of the model's operators in the boundary region. Optimal discretization 
allows us to correct such errors of the numerical scheme as under-resolved boundary layer and 
wrong wave velocity. 
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Convergence of the Cost function 
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